\chapter{Background}
This chapter introduces technologies and concepts which are necessary for the
understanding of the design and the realization of the Muen kernel.  First we
present SPARK, the programming language chosen for the implementation.  After
that, a short description of the Intel/PC architecture is given, followed by an
introduction into virtualization. The concept of the separation kernel is
presented and the main motivation and goals of this work is laid out. The
chapter concludes with an overview of related projects.

\section{SPARK}\label{sec:spark}
SPARK\index{SPARK} is a precisely defined high-level programming language
designed for implementing high integrity systems. It is based on
Ada\index{Ada}, which is itself a programming language with a strong focus on
security and safety.

The SPARK language is a subset of Ada with additional features inserted as
annotations in the form of Ada comments. Since compilers ignore comments and
SPARK is a true subset of Ada, any correct SPARK program is a correct Ada
program and can be compiled using existing Ada compilers. One such compiler is
GNAT, which is part of the GNU compiler collection (GCC\index{GCC}) \cite{gcc}.

However, since annotations are an integral part of SPARK, it would be
misleading to simply consider SPARK a constrained version of Ada. SPARK should
be viewed as a programming language in its own right. The following list
summarizes the Ada restrictions imposed by SPARK:

\begin{itemize}
	\item No access types (pointers)
	\item No recursion
	\item No exceptions
	\item No goto
	\item No anonymous and aliased types
	\item No functions with side-effects
	\item No dynamic array bounds
	\item No dynamic memory allocation
	\item No dynamic dispatching
\end{itemize}

Annotations are processed by SPARK tools. These tools perform static analysis
of source code. The annotations allow the tools to do data and information flow
analysis as well as prove the absence of runtime errors\index{runtime errors}.
This means that SPARK tools allow one to formally verify that a given program
is free of errors such as division by zero, out-of-bounds array access etc. By
using SPARK the following types of errors can be proven to be absent from the
code:

\begin{itemize}
	\item Incorrect indexing of arrays
	\item Overflows
	\item Division by zero
	\item Type range violations
	\item Memory exhaustion
	\item Dangling pointers
\end{itemize}

On top of these properties, the usage of pre-/post-conditions and assertions
allow to prove additional functional properties. A proof of
(partial\footnote{Termination cannot be shown}) correctness of SPARK programs is
achievable. This allows one to formally show the correspondence of an
implementation with a formal specification.

It is also interesting to note, that SPARK has support for tasking in the form
of a language profile\footnote{Restricted subset of the programming language}
called RavenSPARK \cite{RavenSPARK}.

SPARK is a mature technology and has garnered quite some interest since it has
been successfully used in several industrial projects
\cite{Chapman:2000:IES:369264.369270}. It is primarily employed in the field of
avionics, space, medical systems and the defense industry.

\subsection{Design Rationale}
The main driving factors behind the design of SPARK as stated by the language
reference manual \cite{SPARK} are briefly described here:

\begin{description}
	\item[Logical soundness] \hfill \\
		The language must not contain any ambiguities and must be formally
		defined.
	\item[Complexity of formal language definition] \hfill \\
		The language must be simple to specify formally.
	\item[Expressive power] \hfill \\
		The language must be expressive enough to implement complex systems.
	\item[Security] \hfill \\
		It must be possible to avoid entering error conditions or undefined
		state at runtime by static program analysis with a reasonable effort.
	\item[Verifiability] \hfill \\
		Program verification must be modular in order to be tractable for
		industrial scale projects.
	\item[Bounded space and time requirements] \hfill \\
		It must be possible to statically determine resource requirements.
\end{description}

\subsection{Example}
The following listing illustrates how annotations are used to specify the
contract of a subprogram.

\lstinputlisting[
	language=Ada,
	label=lst:spark-example,
	caption=SPARK example]
	{lst_spark}

The declaration of the \texttt{Exchange} procedure states that it has two
parameters X and Y which are of mode \texttt{in out}. This means that the values
of both parameters are imported (\texttt{in}) and exported (\texttt{out}).

Since the specification does not contain a \texttt{global} annotation, no other
state (i.e. global variables) is accessed and the procedure is therefore free of
side-effects. The \texttt{derives} annotation states the data flow: the value of
X is derived from Y and vice versa.

Lastly, the postcondition specifies the values of X and Y after the procedure
has been executed. An identifier decorated with the tilde symbol ($\sim$)
indicates the initial imported value of the variable. Thus the \texttt{post}
annotation states that X will be assigned the initial value of Y and likewise
for Y.

It should be noted that the \texttt{Color\_Type} is a \emph{distinct}
enumeration type which \emph{cannot} be mixed with other types.

An in-depth discussion of the SPARK programming language can be found in
\cite{BarnesSPARK}.

\subsection{SPARK 2014}
At the time of writing\footnote{August 2013}, the SPARK language is undergoing
a major transformation. The goal is to extend the subset of Ada included in
SPARK and to make use of the new Ada 2012 features \cite{Ada2012}. The use of
Ada 2012 aspects will replace the SPARK annotation comments. The official SPARK
2014 release is expected in the first quarter of 2014
\cite{SPARK2014:Announcement}.

Since the development of SPARK 2014 is currently ongoing, the Muen kernel is
implemented using the existing SPARK 2005 language and tools.

\input{back_intel}
\input{back_virt}

\section{Separation Kernel}
In a system with high requirements on security, functions needed to guarantee
these requirements must be isolated from the rest of the system and are called
the Trusted Computing Base (TCB)\index{TCB}. To be trusted, this code must be
kept sufficiently small and straightforward to allow formal verification of code
correctness. Lampson et al. \cite{Lampson:1991:ADS:121133.121160} define the TCB
of a computer system as:
\begin{quote}
	A small amount of software and hardware that security depends on and
	that we distinguish from a much larger amount that can misbehave without
	affecting security.
\end{quote}

Code that is part of the TCB is called \emph{trusted} while components that have
no relevance to the guarantee of security properties are called \emph{untrusted}.

A separation kernel\index{separation kernel} (SK\index{SK}) can be seen as the
fundamental part of a component-based system since its main purpose is to
enforce the separation of all software components while reducing the size of
the TCB. The concept of separation kernels was introduced by John Rushby in a
paper published in 1981 \cite{rushby1981}:

\begin{quote}
The task of a separation kernel is to create an environment which is
indistinguishable from that provided by a physically distributed system: it
must appear as if each regime is a separate, isolated machine and that
information can only flow from one machine to another along known external
communication lines.
\end{quote}

The separation kernel must therefore guarantee that the components can only
interact according to a well-defined policy\index{policy} while running on the
same physical hardware.

A system policy dictates the partitioning of hardware resources like
CPU\index{CPU}, memory or assignment of devices to components. The kernel
guarantees this isolation by emulating a suitable runtime environment for each
component, creating the impression of multiple (virtual) machines. By using
modern virtualization techniques, the kernel is able to delegate certain
management tasks to the hardware. This allows the separation kernel code to be
relatively simple, which is a precondition for formal verification of software.

Because of the simplicity requirement, a system running on top of a separation
kernel is relatively static. The system policy is compiled to a suitable format
at system integration time and cannot change during runtime. This is also the
main difference between the separation kernel concept and other forms of
microkernels which provide hardware abstraction layers, advanced mechanisms for
inter-process communication (IPC\index{IPC}) and dynamic resource management. A
separation kernel does not implement such functionality; its sole purpose is to
guarantee component separation according to a policy while maintaining a small
TCB in terms of size as well as complexity. Policy authors must make sure that
the system specification is sound and that only intended communication channels
are specified between components. Of course, this task can be simplified and
aided by applying appropriate support tools.

\subsection{Subjects}
The separation kernel isolates parts of the TCB into multiple
components\index{component} interacting via well-defined interfaces. In this
paper, such components are called \emph{subjects}\index{subject}.

As previously mentioned, only these features necessary to guarantee subject
separation are present in a separation kernel. Advanced features required to
isolate and virtualize a complex subject are implemented as dedicated
non-privileged subjects, so called \emph{subject monitors} (SM\index{SM}). A
complex subject could be a complete operating system like Linux\index{Linux}.

To allow more runtime flexibility, a separation kernel could also employ a
dedicated subject to offload management tasks. Such a subject runs in normal
unprivileged mode but is allowed to interact with the SK over a specialized
interface.

\subsubsection{Trust}
In the larger context of a component-based system, some subjects can be
considered to be part of the TCB and must therefore be trusted to operate
according to their specification. Such subjects must be developed with the same
diligence as the separation kernel itself.

As an example: a subject is in charge of data encryption and passes the
encrypted data to a second subject which provides network connectivity. A
security property could be that information must always be encrypted when
transmitted over the Internet. To achieve this, the encryption subject must
ensure that no unencrypted information flows to the network subject. Thus the
encryption subject must be considered part of the TCB and is called a
\emph{trusted} subject since its failure would break the security guarantee.

On the other hand, the network subject is not security critical as long as the
encryption subject works according to its specification. It is therefore not
part of the TCB, which means it is an \emph{untrusted} subject.

\section{Motivation}
As stated in the previous section, software must be simple in order to
establish trust in the correct functioning of the code, either through manual
review or formal verification. Even though the tools to automate formal
verification are progressing fast, the constraints on the software to be
verified are still severe. If the SLOC\footnote{Source lines of
code}\index{SLOC} count and thus complexity of a software project exceeds a
critical threshold\footnote{The threshold depends on the programming language
and support tools}, it is no longer possible to automatically prove integrity
and security requirements or to perform a manual review, because the required
effort is just too big.

This is the reason why common operating system kernels are not suitable for use
as a foundation for high assurance systems that employ a small TCB. The
Linux\index{Linux} kernel currently has over 14 million lines of code. While
functionality can be loaded as modules, Linux is still a
monolithic\index{monolithic} kernel at the architectural level, with the
complete code running in the same address space and privilege level. A
programming error in a device driver for example could compromise the integrity
of the entire system.

The motivation for this project is to provide a freely-available SK with a
permissive licensing\index{license} model to allow reviews and modification of
the source code and design documents. It should be possible to independently
reproduce verification artifacts such as automated proofs. In the eyes of the
authors, this approach should ultimately result in a separation kernel that is
applicable to environments where high integrity and security is demanded.

There have been many recent advances in hardware support for virtualization
that greatly simplify the design and implementation of a security kernel. The
authors believe that now is the perfect time to launch a separation kernel
project which benefits from these hardware capabilities.

\section{Goals}
The main goal of this project is to implement a freely-available, open-source
separation kernel. The kernel sources will be licensed under the GNU General
Public License (GPL\index{GPL}) \cite{gpl}.

SPARK is chosen as the main implementation language since this guarantees the
availability of advanced and approved tools to write high assurance code. The
SPARK tools will be utilized to show certain properties of the kernel code,
especially proof of absence of runtime errors is desired. The code of the kernel
must also be as small as reasonable to facilitate a lean TCB.

Operations not strictly required in the kernel should be implemented in a
special trusted subject written in SPARK. This subject is called $\tau$0 and is
considered an integral part of the Muen TCB. While keeping the kernel as simple
as possible, this will allow for later adoption of more dynamic resource
handling and scheduling mechanisms.

The target platform of the SK is 64-bit Intel. No abstraction layer will be
provided to support other processor architectures. Instead, the kernel shall
leverage the latest hardware features of the Intel platform to implement the
isolation while maintaining a small code base.

The main task of the SK is to separate subjects. This is why it must only allow
intended data flows according to the policy and it should prevent or limit
possible side- or covert-channels.

\section{Related Work}
This section presents related projects. While they share many similarities, the
main difference is that the Muen kernel aims to combine various approaches: use
of advanced Intel hardware virtualization features, Intel x86 64-bit target
platform, small code size, application of verification and proof techniques and
making the source code and documentation freely available.

\subsection{seL4}
seL4 is a microkernel\index{microkernel} of the L4
\cite{Liedtke:1996:TRM:234215.234473} family which has been formally verified
\cite{Klein_EHACDEEKNSTW_09}. It aims to provide high assurance of functional
correctness by means of machine-checked formal proofs. Using the theorem prover
Isabelle/HOL \cite{Nipkow-Paulson-Wenzel:2002}, it has been shown that the C
implementation correctly implements an abstract specification.

To our knowledge seL4 is the most advanced project in the field of formal
verification combined with operating system research. Unfortunately its
availability and use is rather limited by restrictive licensing. While parts of
the seL4 project are available for non-commercial use, the source code of the
kernel has not been published. Thus it is not possible to reproduce the formal
proofs.

\subsection{XtratuM}
XtratuM is a type 1 hypervisor\index{hypervisor} specially designed for
real-time embedded systems \cite{xtratum:2009a}. It is developed by the
Real-Time Systems group of the Polytechnic University of Valencia. It uses
para-virtualization to provide one or more virtual execution environments for
partitions. This means software running as partitions must be modified
accordingly to run on top of XtratuM.

The processor privilege mechanism (supervisor and user mode) is used to
separate the VMM from partitions.

The whole project is open-source and published under the GPL license. It is
implemented in C and Assembly. While it supports various platforms such as
LEON2/3 (PowerPC) and Intel x86, it does not support IA-32e mode.

\subsection{NOVA}
NOVA is a recursive acronym and stands for NOVA OS Virtualization Architecture
\cite{Steinberg:2010:NMS:1755913.1755935}. It applies microkernel construction
principles to create a virtualization environment. Its authors have coined
the term microhypervisor\index{microhypervisor} which is short for
microkernelized hypervisor.

NOVA has been developed from scratch with the goal to achieve a thin hypervisor
layer, a user-level virtual-machine monitor and additional unprivileged
components to reduce the attack surface on the most privileged code and thus
increase the overall system security. It runs on Intel and AMD x86 processors
with support for hardware virtualization and is implemented in the C++
programming language. The source code is publicly available \cite{NOVA} and has
been released under the GPL open-source license.

While NOVA is a very promising architecture it is not a SK and provides
insufficient temporal isolation. The choice of programming language (C++) and
its dynamic nature (in general a very desirable property) increases the
verification complexity to the point where it might be infeasible.

\subsection{Commercial Separation Kernels}\label{subsec:commercial-sks}
There are several commercial offerings for separation kernels by various
vendors:

\begin{itemize}
	\item INTEGRITY-178B by Green Hills Software, Inc. is a separation kernel
		EAL-6+ certified against the separation kernel protection profile
		\cite{SKPP}.
	\item PikeOS by SYSGO AG is a microkernel-based real-time OS.
	\item LynxSecure by LynuxWorks Inc. is a type 1 embedded hypervisor for the
		Intel x86 architecture.
	\item VxWorks MILS Platform by WindRiver Inc. is a type 1 hypervisor-based,
		SKPP-con\-for\-mant MILS separation kernel.
	\item PolyXene by Bertin Technologies, is a microkernel-based hypervisor
		EAL-5 certified against the Guest Operating System Hosting Framework
		protection profile.
\end{itemize}

None of these companies publish detailed technical documentation or provide
access to source code. Thus there is not enough information available for a
thorough technical analysis to assess the assurance provided by these kernels
(e.g. if the kernels are suitable for formal verification and what kind of
verification has been performed). Simply put, there is not enough information
to verify the high robustness claims made by the vendors.
